Speech/Non-Speech Detection for Electro-Larynx Speech Using EMG
Authors
Abstract
Electro-larynx (EL) speech is a way to regain speech after the larynx has been surgically removed or damaged. Because currently available devices are normally hand-held, a new generation of EL devices would benefit from hands-free operation. In this work we use electromyographic (EMG) signals to investigate speech/non-speech detection for EL speech. The muscle activity represented by the EMG signal correlates with the intention to produce speech sounds, so its short-term energy can serve as a feature for making a speech/non-speech decision. We developed data acquisition hardware to record EMG signals using surface electrodes. We then recorded a small database of parallel EMG and EL speech recordings and used different approaches to classify the EMG signal into speech and non-speech sections. We compared three envelope calculation methods (root mean square, Hilbert envelope, and low-pass filtered envelope) and three classification methods (single threshold, double threshold, and Gaussian mixture model based classification). This study suggests that the results are speaker dependent, i.e. they strongly depend on the signal-to-noise ratio of the EMG signal. We show that the low-pass filtered envelope combined with double threshold detection outperforms the other combinations.
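The combination of a low-pass filtered envelope with double threshold detection can be illustrated with a minimal sketch. The abstract does not give the filter design, the threshold values, or the exact definition of "double threshold", so everything below is an assumption: the envelope is a full-wave rectification followed by a one-pole low-pass filter, the double threshold is read as hysteresis (a higher threshold to enter the speech state, a lower one to leave it), and the names `lowpass_envelope`, `double_threshold`, `cutoff_hz`, `high`, and `low` are hypothetical. The input is a synthetic burst, not recorded EMG.

```python
import math


def lowpass_envelope(signal, fs, cutoff_hz=5.0):
    """Full-wave rectify, then smooth with a one-pole low-pass filter.

    The cutoff value is an illustrative assumption, not taken from the paper.
    """
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    envelope, state = [], 0.0
    for sample in signal:
        state += alpha * (abs(sample) - state)
        envelope.append(state)
    return envelope


def double_threshold(envelope, high, low):
    """Hysteresis detector (assumed reading of "double threshold").

    Enter the speech state when the envelope rises above `high`;
    leave it only when the envelope falls below `low`.
    """
    active, labels = False, []
    for value in envelope:
        if not active and value > high:
            active = True
        elif active and value < low:
            active = False
        labels.append(active)
    return labels


# Synthetic stand-in for EMG: 1 s rest, 1 s activity burst, 1 s rest.
fs = 1000
signal = []
for n in range(3 * fs):
    amplitude = 1.0 if fs <= n < 2 * fs else 0.01
    signal.append(amplitude * math.sin(2.0 * math.pi * 100.0 * n / fs))

envelope = lowpass_envelope(signal, fs)
labels = double_threshold(envelope, high=0.3, low=0.1)
transitions = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
```

The gap between the two thresholds keeps the decision from flickering when the envelope ripples around a single level, which is one plausible reason such a detector could be more robust than a single threshold on noisy recordings.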
Similar resources
Synthesizing speech from electromyography using voice transformation techniques
Surface electromyography (EMG) can be used to record the activation potentials of articulatory muscles while a person speaks. This technique could enable silent speech interfaces, as EMG signals are generated even when people pantomime speech without producing sound. Having effective silent speech interfaces would enable a number of compelling applications, allowing people to communicate in are...
A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)
Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, much of the non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...
Direct Speech Generation for a Silent Speech Interface based on Permanent Magnet Articulography
Patients with larynx cancer often lose their voice following total laryngectomy. Current methods for post-laryngectomy voice restoration are all unsatisfactory due to different reasons: requires frequent replacement due to biofilm growth (tracheo-oesophageal valve), speech sounds gruff and masculine (oesophageal speech) or robotic (electro-larynx) and, in general, are difficult to master (oeso...
Comparison of the Effect of Production-Training-Based Therapy and Non-Speech Oral Motor Training on the Speech of 4-6-Year-Old Children with Phonological Disorder
Objective: speech sound disorders are among the most common speech disorders in children. Non-speech oral motor exercises have long been used as a facilitative activity throughout therapy sessions for a wide variety of speech disorders by speech-language pathologists. But there are few empirical controlled data to evaluate its effectiveness. This study aimed at comparing the effects of therapeu...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...